An Approximate Lp-Difference Algorithm for Massive Data Streams

Authors

  • Jessica H. Fong
  • Martin Strauss
Abstract

Several recent papers have shown how to approximate the difference $\sum_i |a_i - b_i|$ or $\sum_i |a_i - b_i|^2$ between two functions, when the function values $a_i$ and $b_i$ are given in a data stream and their order is chosen by an adversary. These algorithms use little space (much less than would be needed to store the entire stream) and little time to process each item in the stream, and they approximate with small relative error. Using different techniques, we show how to approximate the $L^p$-difference $\sum_i |a_i - b_i|^p$ for any rational-valued $p \in (0,2]$, with comparable efficiency and error. We also show how to approximate $\sum_i |a_i - b_i|^p$ for larger values of $p$, but with a worse error guarantee. Our results fill in gaps left by recent work, by providing an algorithm that is precisely tunable for the application at hand.
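To make the target quantity concrete, the following minimal sketch computes $\sum_i |a_i - b_i|^p$ exactly. It is not the paper's algorithm: it is the naive baseline that keeps one counter per index, which is the linear-space cost the streaming approach is meant to avoid. The stream format (tuples of source, index, value arriving in arbitrary order) and the name `exact_lp_difference` are assumptions made for illustration.

```python
from collections import defaultdict


def exact_lp_difference(stream, p):
    """Return sum_i |a_i - b_i|**p for a stream of (source, index, value) items.

    Assumed format: source is "a" or "b"; items may arrive in any order,
    and several items for the same (source, index) pair are summed.
    """
    diff = defaultdict(float)
    for source, i, value in stream:
        # Items for function a add to the running difference; items for b subtract.
        diff[i] += value if source == "a" else -value
    # Space is linear in the number of distinct indices, unlike a sketch.
    return sum(abs(d) ** p for d in diff.values())


# Hypothetical stream encoding a = [3, 4, 1] and b = [1, 0, 1], interleaved.
stream = [("a", 0, 3.0), ("b", 2, 1.0), ("b", 0, 1.0), ("a", 1, 4.0), ("a", 2, 1.0)]
l1 = exact_lp_difference(stream, 1)
l2_squared = exact_lp_difference(stream, 2)
```

Because the per-index differences are accumulated as items arrive, the result does not depend on the order of the stream, which is the property an adversarially ordered stream requires.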

Similar resources

An Approximate L1-Difference Algorithm for Massive Data Streams

We give a space-efficient, one-pass algorithm for approximating the $L^1$ difference $\sum_i |a_i - b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families $\{V_j\}$ of limited-independence random variables that are range-summable, by which we mean that $\sum_{j=0}^{c-1} V_j(s)$ i...

An Approximate L-Difference Algorithm for Massive Data Streams

Massive data sets are increasingly important in a wide range of applications, including observational sciences, product marketing, and monitoring and operations of large systems. In network operations, raw data typically arrive in streams, and decisions must be made by algorithms that make one pass over each stream, throw much of the raw data away, and produce “synopses” or “sketches” for furth...

An Approximate Lp-Difference Algorithm for Massive Data Streams

Several recent papers have shown how to approximate the difference $\sum_i |a_i - b_i|$ or $\sum_i |a_i - b_i|^2$ between two functions, when the function values $a_i$ and $b_i$ are given in a data stream, and their order is chosen by an adversary. These algorithms use little space (much less than would be needed to store the entire stream) and little time to process each item in the stream and approximate with sma...

Streaming Algorithms for Distributed, Massive Data Sets

Massive data sets are increasingly important in a wide range of applications, including observational sciences, product marketing, and monitoring and operations of large systems. In network operations, raw data typically arrive in streams, and decisions must be made by algorithms that make one pass over each stream, throw much of the raw data away, and produce “synopses” or “sketches” for furth...

Fast Mining of Massive Tabular Data via Approximate Distance Computations

Tabular data abound in many data stores: traditional relational databases store tables, and new applications also generate massive tabular datasets. For example, consider the geographic distribution of cell phone traffic at different base stations across the country or the evolution of traffic at Internet routers over time. Detecting similarity patterns in such data sets (e.g., which geographi...

Journal title:
  • Discrete Mathematics & Theoretical Computer Science

Volume 4

Publication date: 2000